Abstract


Cryptographic operations like hashing and signing need the data to be expressed in an invariant 
format so that the operations are reliably repeatable. One way to address this is to create a 
canonical representation of the data. Canonicalization also permits data to be exchanged in its 
original form on the "wire" while cryptographic operations performed on the canonicalized 
counterpart of the data in the producer and consumer endpoints generate consistent results. 


This document describes the JSON Canonicalization Scheme (JCS). This specification defines how 
to create a canonical representation of JSON data by building on the strict serialization methods 
for JSON primitives defined by ECMAScript, constraining JSON data to the Internet JSON (I-JSON) 
subset, and by using deterministic property sorting. 


Status of This Memo 


This document is not an Internet Standards Track specification; it is published for informational 
purposes. 


This is a contribution to the RFC Series, independently of any other RFC stream. The RFC Editor
has chosen to publish this document at its discretion and makes no statement about its value for
implementation or deployment. Documents approved for publication by the RFC Editor are not
candidates for any level of Internet Standard; see Section 2 of RFC 7841.


Information about the current status of this document, any errata, and how to provide feedback 
on it may be obtained at https://www.rfc-editor.org/info/rfc8785.


1. Introduction


This document describes the JSON Canonicalization Scheme (JCS). This specification defines how 
to create a canonical representation of JSON [RFC8259] data by building on the strict serialization 
methods for JSON primitives defined by ECMAScript [ECMA-262], constraining JSON data to the
I-JSON [RFC7493] subset, and by using deterministic property sorting. The output from JCS is a
"hashable" representation of JSON data that can be used by cryptographic methods. The 
subsequent paragraphs outline the primary design considerations. 


Cryptographic operations like hashing and signing need the data to be expressed in an invariant 
format so that the operations are reliably repeatable. One way to accomplish this is to convert 
the data into a format that has a simple and fixed representation, like base64url [RFC4648]. This 
is how JSON Web Signature (JWS) [RFC7515] addressed this issue. Another solution is to create a
canonical version of the data, similar to what was done for the XML signature [XMLDSIG] 
standard. 


The primary advantage of a canonicalizing scheme is that data can be kept in its original form.
This is the core rationale behind JCS. Put another way, using canonicalization enables a JSON 
object to remain a JSON object even after being signed. This can simplify system design, 
documentation, and logging. 


To avoid "reinventing the wheel", JCS relies on the serialization of JSON primitives (strings, 
numbers, and literals), as defined by ECMAScript (aka JavaScript) [ECMA-262] beginning with 
version 6. 


Seasoned XML developers may recall difficulties getting XML signatures to validate. This was 
usually due to different interpretations of the quite intricate XML canonicalization rules as well 
as of the equally complex Web Services security standards. The reasons why JCS should not 
suffer from similar issues are: 


* JSON does not have a namespace concept and default values.

* Data is constrained to the I-JSON [RFC7493] subset. This eliminates the need for specific
parsers for dealing with canonicalization.

* JCS-compatible serialization of JSON primitives is currently supported by most web browsers
as well as by Node.js [NODEJS].

* The full JCS specification is currently supported by multiple open-source implementations
(see Appendix G). See also Appendix F for implementation guidelines.


JCS is compatible with some existing systems relying on JSON canonicalization such as JSON Web
Key (JWK) Thumbprint [RFC7638] and Keybase [KEYBASE].


For potential uses outside of cryptography, see [JSONCOMP].


The intended audiences of this document are JSON tool vendors as well as designers of
JSON-based cryptographic solutions. The reader is assumed to be knowledgeable in ECMAScript,
including the "JSON" object.


2. Terminology 


Note that this document is not on the IETF standards track. However, a conformant 
implementation is supposed to adhere to the specified behavior for security and interoperability 
reasons. This text uses BCP 14 to describe that necessary behavior. 


The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD 
NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to 
be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in 
all capitals, as shown here. 


3. Detailed Operation 


This section describes the details related to creating a canonical JSON representation and how 
they are addressed by JCS. 


Appendix F describes the RECOMMENDED way of adding JCS support to existing JSON tools. 


3.1. Creation of Input Data 


Data to be canonically serialized is usually created by: 


* Parsing previously generated JSON data.
* Programmatically creating data.


Irrespective of the method used, the data to be serialized MUST be adapted for I-JSON [RFC7493] 
formatting, which implies the following: 


* JSON objects MUST NOT exhibit duplicate property names.

* JSON string data MUST be expressible as Unicode [UNICODE].

* JSON number data MUST be expressible as IEEE 754 [IEEE754] double-precision values. For
applications needing higher precision or longer integers than offered by IEEE 754 double
precision, it is RECOMMENDED to represent such numbers as JSON strings; see Appendix D
for details on how this can be performed in an interoperable and extensible way.


An additional constraint is that parsed JSON string data MUST NOT be altered during subsequent 
serializations. For more information, see Appendix E. 
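

The I-JSON constraints on strings and numbers can be checked on already parsed data. The
following is a minimal sketch, not part of JCS; the helper name "assertIJson" is hypothetical.
Note that duplicate property names cannot be detected this way, because "JSON.parse()" silently
keeps only the last occurrence; detecting them requires a parser that reports duplicates.

```javascript
// Matches a UTF-16 surrogate code unit that is not part of a valid pair.
const LONE_SURROGATE =
  /[\ud800-\udbff](?![\udc00-\udfff])|(?<![\ud800-\udbff])[\udc00-\udfff]/;

// Hypothetical helper: throws if `value` violates the I-JSON string or
// number constraints. Duplicate property names are NOT covered here.
function assertIJson(value) {
  if (typeof value === "string") {
    if (LONE_SURROGATE.test(value)) {
      throw new Error("string is not expressible as Unicode");
    }
  } else if (typeof value === "number") {
    if (!Number.isFinite(value)) {
      throw new Error("number is not a finite IEEE 754 double");
    }
  } else if (Array.isArray(value)) {
    value.forEach((element) => assertIJson(element));
  } else if (value !== null && typeof value === "object") {
    for (const [name, child] of Object.entries(value)) {
      assertIJson(name);  // property names are strings as well
      assertIJson(child);
    }
  } else if (value !== null && typeof value !== "boolean") {
    throw new Error("unsupported type: " + typeof value);
  }
}

assertIJson(JSON.parse('{"a":[1,"x",null,true]}')); // passes silently
```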


Rundgren, et al. Informational Page 4 


RFC 8785 JSON Canonicalization Scheme June 2020 


Note: Although the Unicode standard offers the possibility of rearranging certain character 
sequences, referred to as "Unicode Normalization" [UCNORM], JCS-compliant string processing 
does not take this into consideration. That is, all components involved in a scheme depending on 
JCS MUST preserve Unicode string data "as is". 


3.2. Generation of Canonical JSON Data 


The following subsections describe the steps required to create a canonical JSON representation 
of the data elaborated on in the previous section. 


Appendix A shows sample code for an ECMAScript-based canonicalizer, matching the JCS 
specification. 


3.2.1. Whitespace 


Whitespace between JSON tokens MUST NOT be emitted.


3.2.2. Serialization of Primitive Data Types 


Assume the following JSON object is parsed: 


{
  "numbers": [333333333.33333329, 1E30, 4.50,
              2e-3, 0.000000000000000000000000001],
  "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
  "literals": [null, true, false]
}

If the parsed data is subsequently serialized using a serializer compliant with ECMAScript's
"JSON.stringify()", the result would (with a line wrap added for display purposes only) be rather
divergent with respect to the original data:


{"numbers":[333333333.3333333,1e+30,4.5,0.002,1e-27],"string":
"€$\u000f\nA'B\"\\\\\"/","literals":[null,true,false]}


The parsed data and its serialized counterpart differ because JSON [RFC8259] tolerates a wide
range of input forms, while output data (as defined by ECMAScript) has a fixed representation.
As can be seen in the example, numbers are subject to rounding as well.
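

The round trip described above can be reproduced in any ECMAScript engine. The sketch below
assumes a runtime with template literals; "String.raw" is used only so that the backslashes of
the JSON text survive inside the source code, and the variable names are illustrative.

```javascript
// String.raw keeps the backslashes exactly as they appear in the JSON text.
const input = String.raw`{
  "numbers": [333333333.33333329, 1E30, 4.50,
              2e-3, 0.000000000000000000000000001],
  "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
  "literals": [null, true, false]
}`;

// Parsing accepts the many input forms; serialization emits one fixed form.
const reserialized = JSON.stringify(JSON.parse(input));
console.log(reserialized);
```

The printed line should match the serialized form shown above, apart from the line wrap that
was added there for display purposes.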


The following subsections describe the serialization of primitive JSON data types according to 
JCS. This part is identical to that of ECMAScript. In the (unlikely) event that a future version of 
ECMAScript would invalidate any of the following serialization methods, it will be up to the 
developer community to either stick to this specification or create a new specification. 


3.2.2.1. Serialization of Literals 


In accordance with JSON [RFC8259], the literals "null", "true", and "false" MUST be serialized as 
null, true, and false, respectively. 




3.2.2.2. Serialization of Strings 


For JSON string data (which includes JSON object property names as well), each Unicode code 
point MUST be serialized as described below (see Section 24.3.2.2 of [ECMA-262]): 


* If the Unicode value falls within the traditional ASCII control character range (U+0000
through U+001F), it MUST be serialized using lowercase hexadecimal Unicode notation
(\uhhhh) unless it is in the set of predefined JSON control characters U+0008, U+0009,
U+000A, U+000C, or U+000D, which MUST be serialized as \b, \t, \n, \f, and \r, respectively.

* If the Unicode value is outside of the ASCII control character range, it MUST be serialized
"as is" unless it is equivalent to U+005C (\) or U+0022 ("), which MUST be serialized as \\ and \",
respectively.


Finally, the resulting sequence of Unicode code points MUST be enclosed in double quotes ("). 


Note: Since invalid Unicode data like "lone surrogates" (e.g., U+DEAD) may lead to 
interoperability issues including broken signatures, occurrences of such data MUST cause a 
compliant JCS implementation to terminate with an appropriate error. 
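

The string rules above can be written out explicitly. The following is an illustrative sketch
(the function name "serializeString" is hypothetical) for runtimes where "JSON.stringify()" is
not available or not trusted. It also rejects lone surrogates, which "JSON.stringify()" in
engines implementing ECMAScript 2019 or later would instead emit as \udxxx escapes rather than
raising an error.

```javascript
// Two-character escapes for the predefined JSON control characters.
const SHORT_ESCAPES = { 0x08: "\\b", 0x09: "\\t", 0x0a: "\\n", 0x0c: "\\f", 0x0d: "\\r" };

// Hypothetical helper: serializes one string according to the rules above.
function serializeString(s) {
  let out = '"';
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: a low surrogate must follow.
      const next = s.charCodeAt(i + 1); // NaN at end of string
      if (!(next >= 0xdc00 && next <= 0xdfff)) {
        throw new Error("lone surrogate at index " + i);
      }
      out += s[i] + s[i + 1];
      i++;
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      throw new Error("lone surrogate at index " + i);
    } else if (unit < 0x20) {
      // Control character: short escape if defined, else lowercase \uhhhh.
      out += SHORT_ESCAPES[unit] ?? "\\u" + unit.toString(16).padStart(4, "0");
    } else if (unit === 0x5c) {
      out += "\\\\";
    } else if (unit === 0x22) {
      out += '\\"';
    } else {
      out += s[i]; // everything else is emitted "as is"
    }
  }
  return out + '"';
}

console.log(serializeString("\u20ac$\u000f\nA'B\"\\\\\"/"));
```

For well-formed input, the result is expected to be identical to what "JSON.stringify()"
returns for the same string.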


3.2.2.3. Serialization of Numbers 

ECMAScript builds on the IEEE 754 [IEEE754] double-precision standard for representing JSON 
number data. Such data MUST be serialized according to Section 7.1.12.1 of [ECMA-262], including 
the "Note 2" enhancement. 


Due to the relative complexity of this part, the algorithm itself is not included in this document. 
For implementers of JCS-compliant number serialization, Google's implementation in V8 [V8] 
may serve as a reference. Another compatible number serialization reference implementation is 
Ryu [RYU], which is used by the JCS open-source Java implementation mentioned in Appendix G. 
Appendix B holds a set of IEEE 754 sample values and their corresponding JSON serialization. 


Note: Since Not a Number (NaN) and Infinity are not permitted in JSON, occurrences of NaN or 
Infinity MUST cause a compliant JCS implementation to terminate with an appropriate error. 
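

In an ECMAScript environment, the number rules reduce to a range check followed by the built-in
number-to-string conversion. This is a minimal sketch; the name "serializeNumber" is
hypothetical. The explicit check matters because "JSON.stringify()" silently turns NaN and
Infinity into null instead of failing.

```javascript
// Hypothetical helper: JCS number serialization on top of the built-in
// ECMAScript number-to-string conversion.
function serializeNumber(n) {
  if (typeof n !== "number" || !Number.isFinite(n)) {
    throw new RangeError("NaN and Infinity are not permitted in JSON");
  }
  return String(n); // note that -0 is serialized as "0"
}

console.log(serializeNumber(1e30));  // 1e+30
console.log(serializeNumber(4.50));  // 4.5
console.log(serializeNumber(2e-3));  // 0.002
console.log(serializeNumber(-0));    // 0
```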


3.2.3. Sorting of Object Properties


Although the previous step normalized the representation of primitive JSON data types, the 
result would not yet qualify as "canonical" since JSON object properties are not in lexicographic 
(alphabetical) order. 


Applied to the sample in Section 3.2.2, a properly canonicalized version should (with a line wrap
added for display purposes only) read as:


{"literals":[null,true,false],"numbers":[333333333.3333333,
1e+30,4.5,0.002,1e-27],"string":"€$\u000f\nA'B\"\\\\\"/"}


The rules for lexicographic sorting of JSON object properties according to JCS are as follows:


* JSON object properties MUST be sorted recursively, which means that JSON child objects
MUST have their properties sorted as well.


* JSON array data MUST also be scanned for the presence of JSON objects (if an object is found,
then its properties MUST be sorted), but array element order MUST NOT be changed.


When a JSON object is about to have its properties sorted, the following measures MUST be
adhered to:


* The sorting process is applied to property name strings in their "raw" (unescaped) form. That
is, a newline character is treated as U+000A.


* Property name strings to be sorted are formatted as arrays of UTF-16 [UNICODE] code units.
The sorting is based on pure value comparisons, where code units are treated as unsigned
integers, independent of locale settings.


* Property name strings either have different values at some index that is a valid index for
both strings, or their lengths are different, or both. If they have different values at one or
more index positions, let k be the smallest such index; then, the string whose value at
position k has the smaller value, as determined by using the "<" operator, lexicographically
precedes the other string. If there is no index position at which they differ, then the shorter
string lexicographically precedes the longer string.


In plain English, this means that property names are sorted in ascending order like the
following:


""
"a"
"aa"
"ab"


The rationale for basing the sorting algorithm on UTF-16 code units is that it maps directly to the 
string type in ECMAScript (featured in web browsers and Node.js), Java, and .NET. In addition, 
JSON only supports escape sequences expressed as UTF-16 code units, making knowledge and 
handling of such data a necessity anyway. Systems using another internal representation of 
string data will need to convert JSON property name strings into arrays of UTF-16 code units 
before sorting. The conversion from UTF-8 or UTF-32 to UTF-16 is defined by the Unicode 
[UNICODE] standard.


The following JSON test data can be used for verifying the correctness of the sorting scheme in a
JCS implementation:


{
  "\u20ac": "Euro Sign",
  "\r": "Carriage Return",
  "\ufb33": "Hebrew Letter Dalet With Dagesh",
  "1": "One",
  "\ud83d\ude00": "Emoji: Grinning Face",
  "\u0080": "Control",
  "\u00f6": "Latin Small Letter O With Diaeresis"
}

Expected argument order after sorting property strings:


"Carriage Return"
"One"
"Control"
"Latin Small Letter O With Diaeresis"
"Euro Sign"
"Emoji: Grinning Face"
"Hebrew Letter Dalet With Dagesh"


Note: For the purpose of obtaining a deterministic property order, sorting of data encoded in
UTF-8 or UTF-32 would also work, but the outcome for JSON data like above would differ and
thus be incompatible with this specification. However, in practice, property names are rarely
defined outside of 7-bit ASCII, making it possible to sort string data in UTF-8 or UTF-32 format
without conversion to UTF-16 and still be compatible with JCS. Whether or not this is a viable
option depends on the environment JCS is used in.
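

The sorting test data can be checked with a few lines of ECMAScript, since
"Array.prototype.sort()" without a comparator orders strings by UTF-16 code units. The sketch
below also includes a code point comparator (the name "compareCodePoints" is illustrative),
which gives the order that sorting UTF-8 or UTF-32 data would produce, to show where the two
orders diverge.

```javascript
const testData = {
  "\u20ac": "Euro Sign",
  "\r": "Carriage Return",
  "\ufb33": "Hebrew Letter Dalet With Dagesh",
  "1": "One",
  "\ud83d\ude00": "Emoji: Grinning Face",
  "\u0080": "Control",
  "\u00f6": "Latin Small Letter O With Diaeresis"
};

// Default sort: strings are compared as sequences of UTF-16 code units.
const jcsOrder = Object.keys(testData).sort().map((name) => testData[name]);

// Comparison by Unicode code points, as UTF-8 or UTF-32 sorting would do.
function compareCodePoints(a, b) {
  const pa = Array.from(a, (ch) => ch.codePointAt(0));
  const pb = Array.from(b, (ch) => ch.codePointAt(0));
  const shared = Math.min(pa.length, pb.length);
  for (let i = 0; i < shared; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return pa.length - pb.length; // shorter string first
}

const codePointOrder = Object.keys(testData)
  .sort(compareCodePoints)
  .map((name) => testData[name]);

console.log(jcsOrder);
console.log(codePointOrder);
```

Only the last two entries differ: in UTF-16 code unit order the emoji (first code unit 0xD83D)
precedes the Hebrew letter (0xFB33), while in code point order U+FB33 precedes U+1F600.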


3.2.4. UTF-8 Generation 


Finally, in order to create a platform-independent representation, the result of the preceding step
MUST be encoded in UTF-8.


Applied to the sample in Section 3.2.3, this should yield the following bytes, here shown in
hexadecimal notation:


7b 22 6c 69 74 65 72 61 6c 73 22 3a 5b 6e 75 6c 6c 2c 74 72
75 65 2c 66 61 6c 73 65 5d 2c 22 6e 75 6d 62 65 72 73 22 3a
5b 33 33 33 33 33 33 33 33 33 2e 33 33 33 33 33 33 33 2c 31
65 2b 33 30 2c 34 2e 35 2c 30 2e 30 30 32 2c 31 65 2d 32 37
5d 2c 22 73 74 72 69 6e 67 22 3a 22 e2 82 ac 24 5c 75 30 30
30 66 5c 6e 41 27 42 5c 22 5c 5c 5c 5c 5c 22 2f 22 7d


This data is intended to be usable as input to cryptographic methods. 
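

The UTF-8 step can be sketched with the "TextEncoder" interface, which is assumed to be
available as a global (as it is in current web browsers and Node.js). The canonical string from
Section 3.2.3 is written with "String.raw" so that its backslashes are kept verbatim.

```javascript
// Canonical form of the sample, without the display line wrap.
const canonical = String.raw`{"literals":[null,true,false],"numbers":[333333333.3333333,1e+30,4.5,0.002,1e-27],"string":"€$\u000f\nA'B\"\\\\\"/"}`;

// TextEncoder always produces UTF-8.
const bytes = new TextEncoder().encode(canonical);

// Render as lowercase hexadecimal, separated by spaces.
const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");

console.log(bytes.length); // 118
console.log(hex);
```

The euro sign is the only character outside of ASCII here; it accounts for the three-byte
sequence e2 82 ac.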


4. IANA Considerations 


This document has no IANA actions.


5. Security Considerations


It is crucial to perform sanity checks on input data to avoid overflowing buffers and similar 
things that could affect the integrity of the system. 


When JCS is applied to signature schemes like the one described in Appendix F, applications 
MUST perform the following operations before acting upon received data: 


1. Parse the JSON data and verify that it adheres to I-JSON. 


2. Verify the data for correctness according to the conventions defined by the ecosystem where 
it is to be used. This also includes locating the property holding the signature data. 


3. Verify the signature. 


If any of these steps fails, the operation in progress MUST be aborted.


Appendix A. ECMAScript Sample Canonicalizer


Below is an example of a JCS canonicalizer for usage with ECMAScript-based systems:


//////////////////////////////////////////////////////////////
// Since the primary purpose of this code is highlighting   //
// the core of the JCS algorithm, error handling and        //
// UTF-8 generation were not implemented.                   //
//////////////////////////////////////////////////////////////
var canonicalize = function(object) {

    var buffer = '';
    serialize(object);
    return buffer;

    function serialize(object) {
        if (object === null || typeof object !== "object" ||
            object.toJSON != null) {
            /////////////////////////////////////////////////
            // Primitive type or toJSON, use "JSON"        //
            /////////////////////////////////////////////////
            buffer += JSON.stringify(object);

        } else if (Array.isArray(object)) {
            /////////////////////////////////////////////////
            // Array - Maintain element order              //
            /////////////////////////////////////////////////
            buffer += '[';
            let next = false;
            object.forEach((element) => {
                if (next) {
                    buffer += ',';
                }
                next = true;
                /////////////////////////////////////////////
                // Array element - Recursive expansion     //
                /////////////////////////////////////////////
                serialize(element);
            });
            buffer += ']';

        } else {
            /////////////////////////////////////////////////
            // Object - Sort properties before serializing //
            /////////////////////////////////////////////////
            buffer += '{';
            let next = false;
            Object.keys(object).sort().forEach((property) => {
                if (next) {
                    buffer += ',';
                }
                next = true;
                /////////////////////////////////////////////
                // Property names are strings, use "JSON"  //
                /////////////////////////////////////////////
                buffer += JSON.stringify(property);
                buffer += ':';
                /////////////////////////////////////////////
                // Property value - Recursive expansion    //
                /////////////////////////////////////////////
                serialize(object[property]);
            });
            buffer += '}';
        }
    }
};


Appendix B. Number Serialization Samples 


The following table holds a set of ECMAScript-compatible number serialization samples,
including some edge cases. The column "IEEE 754" refers to the internal ECMAScript
representation of the "Number" data type, which is based on the IEEE 754 [IEEE754] standard
using 64-bit (double-precision) values, here expressed in hexadecimal.


IEEE 754           JSON Representation          Comment
----------------   --------------------------   -----------------
0000000000000000   0                            Zero
8000000000000000   0                            Minus zero
0000000000000001   5e-324                       Min pos number
8000000000000001   -5e-324                      Min neg number
7fefffffffffffff   1.7976931348623157e+308      Max pos number
ffefffffffffffff   -1.7976931348623157e+308     Max neg number
4340000000000000   9007199254740992             Max pos int (1)
c340000000000000   -9007199254740992            Max neg int (1)
4430000000000000   295147905179352830000        ~2**68 (2)
7fffffffffffffff                                NaN (3)
7ff0000000000000                                Infinity (3)
44b52d02c7e14af5   9.999999999999997e+22
44b52d02c7e14af6   1e+23
44b52d02c7e14af7   1.0000000000000001e+23
444b1ae4d6e2ef4e   999999999999999700000
444b1ae4d6e2ef4f   999999999999999900000
444b1ae4d6e2ef50   1e+21
3eb0c6f7a0b5ed8c   9.999999999999997e-7
3eb0c6f7a0b5ed8d   0.000001
41b3de4355555553   333333333.3333332
41b3de4355555554   333333333.33333325
41b3de4355555555   333333333.3333333
41b3de4355555556   333333333.3333334
41b3de4355555557   333333333.33333343
becbf647612f3696   -0.0000033333333333333333
43143ff3c1cb0959   1424953923781206.2           Round to even (4)

Table 1: ECMAScript-Compatible JSON Number Serialization Samples


Notes: 


(1) For maximum compliance with the ECMAScript "JSON" object, values that are to be
interpreted as true integers SHOULD be in the range -9007199254740991 to
9007199254740991. However, how numbers are used in applications does not affect the JCS
algorithm.


(2) Although a set of specific integers like 2**68 could be regarded as having extended
precision, the JCS/ECMAScript number serialization algorithm does not take this into
consideration.


(3) Values out of range are not permitted in JSON. See Section 3.2.2.3.


(4) This number is exactly 1424953923781206.25 but will, after the "Note 2" rule mentioned in
Section 3.2.2.3, be truncated and rounded to the closest even value.


For a more exhaustive validation of a JCS number serializer, you may test against a file
(currently) available in the development portal (see Appendix I) containing a large set of sample
values. Another option is running V8 [V8] as a live reference together with a program generating
a substantial amount of random IEEE 754 values.
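

A few rows of Table 1 can be replayed with a short script that turns the hexadecimal IEEE 754
pattern into a double and serializes it. This is a sketch under the assumption that "DataView"
with BigInt support is available; the helper names "doubleFromHex" and "serializeNumber" are
hypothetical, and only a subset of the table is included.

```javascript
// Builds a double from its 64-bit IEEE 754 pattern given as 16 hex digits.
function doubleFromHex(hex) {
  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, BigInt("0x" + hex)); // big-endian by default
  return view.getFloat64(0);
}

// JCS number serialization: reject NaN/Infinity, else built-in conversion.
function serializeNumber(n) {
  if (!Number.isFinite(n)) {
    throw new RangeError("NaN and Infinity are not permitted in JSON");
  }
  return String(n);
}

// [IEEE 754 pattern, expected JSON representation] taken from Table 1.
const samples = [
  ["0000000000000000", "0"],
  ["8000000000000000", "0"],
  ["0000000000000001", "5e-324"],
  ["7fefffffffffffff", "1.7976931348623157e+308"],
  ["4430000000000000", "295147905179352830000"],
  ["44b52d02c7e14af5", "9.999999999999997e+22"],
  ["44b52d02c7e14af6", "1e+23"],
  ["44b52d02c7e14af7", "1.0000000000000001e+23"],
  ["444b1ae4d6e2ef50", "1e+21"],
  ["3eb0c6f7a0b5ed8d", "0.000001"],
  ["43143ff3c1cb0959", "1424953923781206.2"]
];

for (const [pattern, expected] of samples) {
  const actual = serializeNumber(doubleFromHex(pattern));
  console.log(pattern, actual, actual === expected ? "ok" : "MISMATCH");
}
```

The patterns 7fffffffffffffff (NaN) and 7ff0000000000000 (Infinity) are left out of the list
because "serializeNumber" is expected to throw for them.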


Appendix C. Canonicalized JSON as "Wire Format" 


Since the result from the canonicalization process (see Section 3.2.4) is fully valid JSON, it can
also be used as "Wire Format". However, this is just an option since cryptographic schemes based
on JCS, in most cases, would not depend on that externally supplied JSON data already being
canonicalized.




In fact, the ECMAScript standard way of serializing objects using "JSON.stringify()" produces a
more "logical" format, where properties are kept in the order they were created or received. The
example below shows an address record that could benefit from ECMAScript standard
serialization:


{
  "name": "John Doe",
  "address": "2000 Sunset Boulevard",
  "city": "Los Angeles",
  "zip": "90001",
  "state": "CA"
}


Using canonicalization, the properties above would be output in the order "address", "city",
"name", "state", and "zip", which adds fuzziness to the data from a human (developer or technical
support) perspective. Canonicalization also converts JSON data into a single line of text, which
may be less than ideal for debugging and logging.
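

The difference between the two property orders can be shown with the address record. The
snippet below is only an illustration; the variable names are arbitrary.

```javascript
const record = JSON.parse(`{
  "name": "John Doe",
  "address": "2000 Sunset Boulevard",
  "city": "Los Angeles",
  "zip": "90001",
  "state": "CA"
}`);

// JSON.parse and JSON.stringify keep the order in which properties arrived.
const wireOrder = Object.keys(record);

// Canonicalization sorts the property names instead.
const canonicalOrder = Object.keys(record).sort();

console.log(wireOrder.join(", "));      // name, address, city, zip, state
console.log(canonicalOrder.join(", ")); // address, city, name, state, zip

// For logs, the non-canonical form can additionally be pretty-printed.
console.log(JSON.stringify(record, null, 2));
```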


Appendix D. Dealing with Big Numbers 


There are several issues associated with the JSON number type, here illustrated by the following 
sample object: 


{
  "giantNumber": 1.4e+9999,
  "payMeThis": 26000.33,
  "int64Max": 9223372036854775807
}


Although the sample above conforms to JSON [RFC8259], applications would normally use 
different native data types for storing "giantNumber" and "int64Max". In addition, monetary data 
like "payMeThis" would presumably not rely on floating-point data types due to rounding issues 
with respect to decimal arithmetic. 


The established way of handling this kind of "overloading" of the JSON number type (at least in 
an extensible manner) is through mapping mechanisms, instructing parsers what to do with 
different properties based on their name. However, this greatly limits the value of using the JSON 
number type outside of its original, somewhat constrained JavaScript context. The ECMAScript 
"JSON" object does not support mappings to the JSON number type either. 


Due to the above, numbers that do not have a natural place in the current JSON ecosystem MUST 
be wrapped using the JSON string type. This is close to a de facto standard for open systems. This 
is also applicable for other data types that do not have direct support in JSON, like "DateTime" 
objects as described in Appendix E.


Aided by a system using the JSON string type, be it programmatic like


var obj = JSON.parse('{"giantNumber": "1.4e+9999"}');
var biggie = new BigNumber(obj.giantNumber);


or declarative schemes like OpenAPI [OPENAPI], JCS imposes no limits on applications, including 
when using ECMAScript. 
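

What happens to such values when they are carried as JSON numbers can be observed directly in
ECMAScript. The sketch below uses the built-in "BigInt" type for the integer case only; it is
not a replacement for an arbitrary-precision decimal type like the "BigNumber" mentioned above.

```javascript
// Carried as JSON numbers, both values are forced into IEEE 754 doubles.
const asNumbers = JSON.parse(
  '{"giantNumber": 1.4e+9999, "int64Max": 9223372036854775807}'
);
console.log(asNumbers.giantNumber);      // Infinity
console.log(String(asNumbers.int64Max)); // 9223372036854776000

// Carried as a JSON string, the digits survive parsing unchanged and can be
// converted to a suitable native type in a subsequent step.
const asStrings = JSON.parse('{"int64Max": "9223372036854775807"}');
const exact = BigInt(asStrings.int64Max);
console.log(exact.toString());           // 9223372036854775807
```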


Appendix E. String Subtype Handling 


Due to the limited set of data types featured in JSON, the JSON string type is commonly used for 
holding subtypes. This can, depending on JSON parsing method, lead to interoperability 
problems, which MUST be dealt with by JCS-compliant applications targeting a wider audience. 


Assume you want to parse a JSON object where the schema designer assigned the property "big" 
for holding a "BigInt" subtype and "time" for holding a "DateTime" subtype, while "val" is 
supposed to be a JSON number compliant with JCS. The following example shows such an object:


{
  "time": "2019-01-28T07:45:10Z",
  "big": "055",
  "val": 3.5
}


Parsing of this object can be accomplished by the following ECMAScript statement:


var object = JSON.parse(JSON_object_featured_as_a_string);


After parsing, the actual data can be extracted, which for subtypes, also involves a conversion 
step using the result of the parsing process (an ECMAScript object) as input: 


new Date(object.time); // Date object
BigInt(object.big);    // Big integer
object.val;            // JSON/JS number


Note that the "BigInt" data type is currently only natively supported by V8 [V8]. 


Canonicalization of "object" using the sample code in Appendix A would return the following
string:


{"big":"055","time":"2019-01-28T07:45:10Z","val":3.5}


Although this is (with respect to JCS) technically correct, there is another way of parsing JSON
data, which also can be used with ECMAScript as shown below: 


// "BigInt" requires the following code to become JSON serializable
BigInt.prototype.toJSON = function() {
    return this.toString();
};

// JSON parsing using a "stream"-based method
var object = JSON.parse(JSON_object_featured_as_a_string,
    (k,v) => k == 'time' ? new Date(v) : k == 'big' ? BigInt(v) : v
);

If you now apply the canonicalizer in Appendix A to "object", the following string would be
generated:


{"big":"55","time":"2019-01-28T07:45:10.000Z","val":3.5}


In this case, the string arguments for "big" and "time" have changed with respect to the original, 
presumably making an application depending on JCS fail. 


The reason for the deviation is that in stream- and schema-based JSON parsers, the original 
string argument is typically replaced on the fly by the native subtype that, when serialized, may 
exhibit a different and platform-dependent pattern. 


That is, stream- and schema-based parsing MUST treat subtypes as "pure" (immutable) JSON 
string types and perform the actual conversion to the designated native type in a subsequent 
step. In modern programming platforms like Go, Java, and C#, this can be achieved with 
moderate efforts by combining annotations, getters, and setters. Below is an example in C#/ 
Json.NET showing a part of a class that is serializable as a JSON object: 


// The "pure" string solution uses a local
// string variable for JSON serialization while
// exposing another type to the application
[JsonProperty("amount")]
private string _amount;

[JsonIgnore]
public decimal Amount {
    get { return decimal.Parse(_amount); }
    set { _amount = value.ToString(); }
}

In an application, "Amount" can be accessed as any other property while it is actually 
represented by a quoted string in JSON contexts.


Note: The example above also addresses the constraints on numeric data implied by I-JSON (the
C# "decimal" data type has quite different characteristics compared to IEEE 754 double 
precision). 
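

The same "pure" string approach can be sketched in ECMAScript for the "time" and "big"
properties used earlier in this appendix. The class name "SubtypeRecord" is hypothetical, and
private class fields are assumed to be supported by the runtime. The original strings are
stored untouched and only converted on access, so serialization reproduces them exactly.

```javascript
class SubtypeRecord {
  // Original JSON strings, never modified after parsing.
  #time;
  #big;

  constructor(parsed) {
    this.#time = parsed.time;
    this.#big = parsed.big;
    this.val = parsed.val;
  }

  // Conversion to native types happens in a subsequent step, on access.
  get time() { return new Date(this.#time); }
  get big() { return BigInt(this.#big); }

  // JSON.stringify() calls toJSON(); the stored strings are emitted "as is".
  toJSON() {
    return { big: this.#big, time: this.#time, val: this.val };
  }
}

const text = '{"big":"055","time":"2019-01-28T07:45:10Z","val":3.5}';
const subtypeRecord = new SubtypeRecord(JSON.parse(text));

console.log(subtypeRecord.big);                    // 55n
console.log(JSON.stringify(subtypeRecord) === text); // true
```

Since the sample canonicalizer in Appendix A delegates to "JSON.stringify()" for any value
having a "toJSON" method, the property order returned by "toJSON" is kept sorted here.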


E.1. Subtypes in Arrays 


Since the JSON array construct permits mixing arbitrary JSON data types, custom parsing and 
serialization code may be required to cope with subtypes anyway. 


Appendix F. Implementation Guidelines 


The optimal solution is integrating support for JCS directly in JSON serializers (parsers need no 
changes). That is, canonicalization would just be an additional "mode" for a JSON serializer. 
However, this is currently not the case. Fortunately, JCS support can be introduced through 
externally supplied canonicalizer software acting as a post processor to existing JSON serializers. 
This arrangement also relieves the JCS implementer from having to deal with how underlying 
data is to be represented in JSON. 


The post processor concept enables signature creation schemes like the following: 


1. Create the data to be signed.

2. Serialize the data using existing JSON tools.

3. Let the external canonicalizer process the serialized data and return canonicalized result
data.

4. Sign the canonicalized data.

5. Add the resulting signature value to the original JSON data through a designated signature
property.

6. Serialize the completed (now signed) JSON object using existing JSON tools.


A compatible signature verification scheme would then be as follows: 


1. Parse the signed JSON data using existing JSON tools.

2. Read and save the signature value from the designated signature property.

3. Remove the signature property from the parsed JSON object.

4. Serialize the remaining JSON data using existing JSON tools.

5. Let the external canonicalizer process the serialized data and return canonicalized result
data.

6. Verify that the canonicalized data matches the saved signature value using the algorithm
and key used for creating the signature.


A canonicalizer like the one above is effectively only a "filter", potentially usable with a
multitude of quite different cryptographic schemes.
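

The creation and verification schemes above can be sketched for Node.js as follows. Several
things here are assumptions made for illustration only: HMAC-SHA256 stands in for the signature
algorithm, the property name "signature" and the function names are hypothetical, and the
compact canonicalizer works on parsed data rather than as a text-to-text post processor. The
I-JSON and application-level checks required by Section 5 are omitted.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

// Compact canonicalizer for data produced by JSON.parse (no error handling).
function canonicalize(value) {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map((v) => canonicalize(v)).join(",") + "]";
  return "{" + Object.keys(value).sort()
    .map((name) => JSON.stringify(name) + ":" + canonicalize(value[name]))
    .join(",") + "}";
}

// HMAC over the UTF-8 encoding of the canonicalized data.
function mac(data, key) {
  return createHmac("sha256", key).update(canonicalize(data), "utf8").digest();
}

// Creation: canonicalize, sign, add the signature property, serialize.
function createSigned(data, key) {
  const signature = mac(data, key).toString("hex");
  return JSON.stringify({ ...data, signature });
}

// Verification: parse, remove the signature property, canonicalize, compare.
function verifySigned(text, key) {
  const { signature, ...rest } = JSON.parse(text);
  if (typeof signature !== "string") return false;
  const given = Buffer.from(signature, "hex");
  const expected = mac(rest, key);
  return given.length === expected.length && timingSafeEqual(given, expected);
}

const signed = createSigned({ b: 1, a: [true, null, "x"] }, "demo-key");
console.log(signed);
console.log(verifySigned(signed, "demo-key")); // true
```

Because verification canonicalizes the received data again, the signed object may be
reformatted or have its properties reordered in transit without invalidating the signature.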


Using a JSON serializer with integrated JCS support, the serialization performed before the 
canonicalization step could be eliminated for both processes.


Appendix G. Open-Source Implementations


The following open-source implementations have been verified to be compatible with JCS: 


* JavaScript: <https://www.npmjs.com/package/canonicalize>

* Java: <https://github.com/erdtman/java-json-canonicalization>

* Go: <https://github.com/cyberphone/json-canonicalization/tree/master/go>

* .NET/C#: <https://github.com/cyberphone/json-canonicalization/tree/master/dotnet>

* Python: <https://github.com/cyberphone/json-canonicalization/tree/master/python3>


Appendix H. Other JSON Canonicalization Efforts 


There are (and have been) other efforts creating "Canonical JSON". Below is a list of URLs to some
of them:


* <https://tools.ietf.org/html/draft-staykov-hu-json-canonical-form-00>

* <https://gibson042.github.io/canonicaljson-spec/>

* <http://wiki.laptop.org/go/Canonical_JSON>


The listed efforts all build on text-level JSON-to-JSON transformations. The primary feature of
text-level canonicalization is that it can be made neutral to the flavor of JSON used. However,
such schemes also imply major changes to the JSON parsing process, which is a likely hurdle for
adoption. Albeit at the expense of certain JSON and application constraints, JCS was designed to
be compatible with existing JSON tools.


Appendix I. Development Portal 


The JCS specification is currently developed at: <https://github.com/cyberphone/ietf-json-canon>.


JCS source code and extensive test data are available at:
<https://github.com/cyberphone/json-canonicalization>.